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_  Abstract 

The  method  of  multiple  replicates  is  frequently  used  by  simulators  to  estimate  the 
steady-state  mean  of  a  stochastic  simulation.  One  important  advantage  of  this  approach 
is  that  it  is  easily  adapted  to  a  parallel  computer.  Unfortxmately,  the  method  of  multiple 
replicates  is  quite  sensitive  to  contamination  by  initial  bias.  In  this  paper,  a  new  type  of 
sampling  plan  is  described.  It  retmns  the  replication  flavor,  yet  attenuates  the  bias  prob¬ 
lem.  It  is  shown  that  the  new  method  reduces  mean  square  error  relative  to  conventional 
multiple  replicates  for  problems  in  which  the^^initial  transient”^ecays  slowly. 


Keywords:  Simulation,  replication,  mean  square  error,  parallel  computation. 
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Introduction 

Let  K  =  (K(n) :  n  >  0)  be  a  real-valued  stochastic  sequence  corresponding  to  the  output 
of  a  stochastic  simulation.  We  assume  that  Y  is  ergodic,  in  the  sense  that  there  exists  a 
finite  (deterministic)  constant  r  such  that 


n— 1 


1  ^  * 


imO 


as  n  oo.  The  steadv-state  simulation  problem  concerns  the  question  of  estimating  the 
pcirameter  r  efficiently,  and  providing  confidence  intervals  for  r. 

Basically,  two  alternative  approaches  for  dealing  with  this  problem  have  been  studied 
in  the  literature.  One  approach  is  known  as  the  method  of  miiltinie  replicat*^.  The  idea 
here  is  to  generate  m  independent  replicates  of  the  process  Y.  Each  replicate  is  simulated 
for  t  time  units.  The  advantage  of  this  method  is  that  it  gives  rise  to  independent  ob¬ 
servations;  this  significantly  simplifies  the  problem  of  producing  confidence  intervals  for 
r.  Furthermore,  given  access  to  a  parallel  computing  environment,  one  can  assign  each 
independent  replicate  to  a  different  processor.  Thus,  the  method  of  multiple  replicates  is 
well  suited  to  parallel  computation. 

A  disadvantage  of  this  approach  is  that  each  of  the  m  independent  replicates  is  con¬ 
taminated  by  initial  bias.  This  initial  bias  arises  from  the  fact  that  each  of  the  m  replicates 
is  initiated  with  an  initial  condition  that  is  atypical  of  the  steady-state  of  the  system.  If 
we  view  the  first « time  units  of  each  replicate  as  representing  an  "initial  transient”  for  the 
system,  this  analysis  suggests  that  nu  time  units  of  the  total  time  simulated  are  contami¬ 
nated  by  initial  bias.  If  m  is  large,  we  find  that  the  method  of  multiple  replicates  devotes 
a  significant  amoimt  of  computation  to  generation  of  highly  biased  observations.  This  is, 
of  course,  xmdesirable. 

In  response  to  this,  we  can  consider  sampling  plans  in  which  only  one  observation  of 
Y  is  generated.  Such  a  strategy  is  known  in  the  literature  as  a  single  replication  method. 
Here,  only  the  first  «  time  units  of  the  simulation  are  significantly  biased,  and  there  is  no 
magnification  effect  by  the  parameter  m.  On  the  other  hand,  construction  of  confidence 
intervals  for  r  is  now  complicated  by  the  fact  that  all  the  observations  collected  are  au- 
tocorrelated.  Furthermore,  it  is  now  a  non-trivial  task  to  make  an  assignment  of  parallel 
processors  that  will  significantly  speed  up  the  simulation. 

Note  that  the  method  of  multiple  replicates  involves  factoring  a  computer  time  budget 
T  into  m  replicates,  each  of  length  t  =  T/m.  If  we  view  the  data  of  the  »’th  replicate  as  being 
assigned  to  the  «*th  row  of  a  matrix,  we  obtain  a  rectangular  mxt  matrix  which  summarizes 
the  data  generated  by  the  simulation.  Consequently,  we  refer  to  the  method  of  multiple 
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replicates  as  a  rectangular  sampling  plan  for  estimating  steady-state  means  (see  Figure  1). 
Of  course,  a  single  replicate  method  is  the  special  case  of  a  rectangular  scheme  in  which 
the  data  corresponds  to  a  l  x  T  row  vector. 

In  this  paper,  we  consider  these  rectangular  methods  in  greater  detail.  We  also  pro¬ 
pose  and  analyze  a  new  non-rectangular  sampling  scheme,  which  attempts  to  offer  an 
advantageous  compromise  between  the  methods  of  single  and  multiple  replicates. 

The  organization  of  this  paper  is  as  follows.  Section  2  provides  reasonably  complete 
mean  square  error  analysis  of  conventional  rectangular  sampling  plans.  In  Section  3,  the 
non-rectangpilar  plan  is  introduced  and  studied.  Section  4  offers  some  conclusions. 


2.  Rectangular  Sampling  Plans 

We  start  by  describing  the  traditional  method  of  replication  for  solving  the  steady- 
state  simulation  problem.  To  simplify  the  discussion  that  follows,  we  will  assiime  that  in 
X  \mits  of  computer  time,  precisely  z  time  units  of  the  process  Y  can  be  simulated.  Th\is, 
given  a  total  computer  time  budget  of  size  T,  we  can  implement  a  rectangular  sampling 
plan  in  the  following  way: 

1. )  Choose  the  number  m  of  independent  replicates.  (If  m  =  i,  this  is  a  single  replication 

method.) 

2. )  Choose  the  (deletion)  parameter  •,  from  the  interval  (O.r/m).  (The  first  «  time  units  of 

each  replication  will  be  deleted  from  the  set  of  observations.) 

3. )  Generate  m  independent  copies  Yi,Y3,...,Ym  of  the  process  Y.  Each  copy  is  simulated 

over  the  interval  [0,  T/m]. 

4. )  Set  t  =  [T/mJ  and  compute  the  estimator 


We  will  now  consider  the  mean  square  error  (MSE)  of  the  estimator  Y(m,B,T).  The 
MSE  criterion  is  often  viewed  as  the  most  important  quantitative  measure  of  the  quality 
of  an  estimator.  We  start  with  the  well  known  MSE  decomposition  formula 


(2.1)  MSE(f  (m,*,r))  *  wF(m,#,r)  -I-  (bias 

By  usirg  the  independence  of  the  replicates,  we  observe  that 


(2.2) 


var  ?(»»», 


1 

—  var 
m 


1 

*-• 


E  >'«• 


y-»+i 


(2.3) 


biasf(m,a,r)*j^  ^  EYlj)-r. 


i-.+i 
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A  Rectangular  Sampling  Plan 
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Figure  1 


The  Non-rectangular  Sampling  Plan 
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Figure  2 
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In  order  to  analyze  the  terms  appearing  on  the  right-hand  sides  of  (2.2)  and  (2.3),  we 
will  assume  that  y(n)  can  be  expressed  as  a  real-valued  functional  of  a  time-homogeneous 
Markov  chain  X{n),  so  that  Y{n)  =  /(X(n))  for  some  real-valued  /  defined  on  the  state  space 
5  of  X.  The  set  5  may  be  discrete  or  continuous.  Continuotis  state  space  is  particularly 
convenient  in  analysb  of  discrete-event  simulations.  The  generalized  semi-Markov  process 
(GSMP)  view  of  discrete-event  systems  shows  that  very  general  discrete-event  simulations 
may  be  expressed  in  the  form  y(n)  =  /(Jr(n))  with  X  Markov,  provided  that  we  permit 
continuous  state  spzice. 

For  xeS,  u  >  1,  let  v(z,  u)  be  the  conditionad  varismce  defined  by 


v(x,  u)  =  E 


w  =  «|- =  ■ 


Similarly,  let  b(x,  u)  be  the  conditional  bias  given  by 


6(*.u)  =  i;(i£V(y)iJ:(o)  =  *|-r. 

I 


Let  fi(  )  =  P{X(0)e  }  be  the  initial  distribution  of  X.  The  Markov  property  permits  us  to 
re-express  (2.3)  as 

(2.4)  bias  f(m,3,T)  =  E^6(X(s  +  !),<-  a), 

where  E^(  )  denotes  the  expectation  operator  conditional  on  Jr(0)  having  distribution  m. 

To  obtain  a  similar  expression  for  the  variance  term  (2.2)  requires  more  care.  We  first 
apply  the  well  known  variance  decomposition  formula 

var^  E  =  ^  y (j) |Ar(a  +  1)  [ 

(2.5)  ^  ^ 

+  ^  y(j)|A-(a-H)|. 


Clearly,  we  have 


E  ny)W.  +  l)|=t;(X(a  +  l),t-a), 

I  ^  E  + 1)  I  =  W*  +  1).  *  - '). 


Plugging  these  expressions  into  (2.5)  yields 


,  * 

var-!-  E  y(y)  =  i?M»(.^(-  +  l).*-*)  +  var^Wa-l-l),t-a). 
•  **  ^ 
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where  varj,(  )  denotes  the  variance  operator  conditional  on  X(0)  having  distribution  n. 
Suppose  that  X  is  a  positive  recurrent  Markov  chain  possessing  a  unique  invariant 
probability  distribution  A  large  class  of  such  chains  has  the  property  that  under  suitable 
regularity  conditions, 

sup  -  ^,A(A'(0))|  =  0(«-“*) 

htM 

for  some  a  >  0,  where  )i  is  some  appropriately  defined  family  of  real-valued  functions 
h:  S  -*  Si.  (See  NUMMELIN  (1984),  p.  120,  for  an  example  of  such  a  theorem.)  Assuming 
that  the  functions  v(-,u),  b^{-,u)e)l  for  all  u  >  1,  we  obtain  the  relations 

(2.7)  E^v[X{s  +  l),t  -  5)  =  .E,V(X(0),t  -  a)  +  0{«-“‘), 

(2.8)  EJ{X{a  +  !),»-«)  =  E,b{X{0),t-  «)  +  Ole""), 

(2.9)  E^b^{X{B  +  l).t  -  •)  =  E,b^{X{0),  t-a)  +  0(e-“*), 

where  the  constants  implicit  in  each  of  the  “big  Oh”  terms  are  independent  of  t. 

Furthermore,  for  such  a  recurrent  Markov  chain,  it  is  typically  the  case  that  the  steady- 
state  mean  r  can  be  expressed  in  the  form  r  =  £,/(X(0)).  As  a  consequence  of  the  station- 
arity  of  X  tinder  initial  distribution  ir,  it  is  evident  that  E,Y(n)  =  r  for  n  >  0  and  hence 
E,b(X(o),t-  a)  =  0.  Thus,  (2.8)  can  be  simplified  to 

(2.10)  E^b(X(a  +  l),t-a)  =  0(e-“*). 

Combining  (2.9)  and  (2.10),  we  obtain 

(2.11)  yar^blX(a  +  l),t^a)  =  Enb^(X(0).  t  -  a)  +  0(e-«*). 

(Agdn,  the  constants  implicit  in  (2.10)  and  (2.11)  are  independent  of  t.) 

Combining  (2.6),  (2.7),  and  (2.11),  we  obtain  the  expression 

T  y(y)  =  .B,»(X(0),t-.)  +  ^6»(jr(0),t-a)  +  O(e-«). 

*  '  i-.+l 

Repeating  the  variance  decomposition  (2.6)  under  var,r(-),  we  find  that 

var,-^  2  + 

*  '  /-.+i 

and  hence 

(2.12)  i  ^  TO+o(e— ). 

>»#+!  y**+i 
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To  simplify  (2.12),  we  again  use  the  fact  that  X  is  stationary  xmder  initial  distribution 
■K.  Set  yj:(n)  =  y(n)  -  r, 

<t3  =  EMQ)'" + 2  f;  £?,n(o)n(*) 

ksl 

r,^2^kE,Y,(Q)Y,[k). 

k-t 


Under  appropriate  summability  hypotheses  (see,  for  example,  p.  172  of  BILLINGSLEY 
(1968)),  we  can  use  the  stationarity  to  write  _ 

(2.13)  KOI .  ^  ^  ^  B,n(o)n(t). 

j=«+l  '  '  k^t^$  '  ' 


Note  that 


(2.14)  |E,yc(0)n(fc)|  <  /  !/(*)! .  |£?.n(fc)|  •  r{dx), 

Js 

where  E,{  )  is  the  expectation  operator  conditional  on  X(0)  =  z.  We  now  observe  that 
E,Y{k)  =  E,f{X{k))-Enf{X{0)).  Appropriate  regularity  hypotheses  on  X  permit  us  to  assert 
that 

(2.15)  flip  |i:./(X(A:))  -  .B./(Jr(0))|  =  0(e-^‘) 

•«s 

for  some  /?  >  0.  (See  p.  122  of  NUMMELIN  (1984)  for  a  typical  such  result.)  Substituting 
this  relation  in  (2.14)  yields 

f?,yc(0)n(*)  =  0(e-^*). 

We  may  therefore  conclude  that 

(2.16)  £('"!)  ^-5;(0)y.(*)  =  0(e-^'“) 
for  0<P'  <  p.  Substitution  of  (2.16)  into  (2.13)  shows  that 

(2.17)  ^ + o(.-'-'-'). 

Combining  (2.1),  (2.2),  (2.4),  (2.10),  (2.12),  and  (2.17),  we  obtain  the  important  relar 
tionship 

(2.18)  MSE(P(m,.,T))  =  ^  +  0(.— )  +  io(«-''(— 1). 

where  the  implicit  constants  appearing  in  the  "big  OH"  terms  are  independent  of  m, «,  and 
T. 
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To  gain  further  insight  into  (2.18),  we  consider  the  typical  situation,  in  which  the 
deletion  point  a  is  small  relative  to  the  length  t  of  each  replicate.  Furthermore,  in  order  to 
simplify  the  discussion,  we  assume  that  mt  =  T  (exactly).  Then, 


(2.19) 

1  a^ma  1  ^  > 

mt-a  T  ■  ja  T  \  T»  ) 

(2.20) 

1  »j  m*7  m  fma 

m{t-a)^  V  T  ■ 

Combining  (2.18)  through  (2.20),  we  obtain  the  approximation 

(2.21)  MSE(r(m, D)  »  ^ 

Viewing  5  and  m  as  design  parameters  for  the  simulation,  we  see  that  (2,21)  suggests 
that  the  deletion  parameter  a  should  be  small.  On  the  other  hand,  if  a  is  chosen  too  small, 
difficulties  cam  airise  in  the  “big  Oh”  terms  appeairing  in  (2.18).  This  recommendation 
corresponds  to  intuition. 

As  for  the  number  of  replications  m,m  should  be  chosen  small  (for  exaunple,  a  single 
replicate  method  should  be  considered)  whenever  a^a  >  ri.  For  reasonable  values  of  a,  this 
inequality  will  typically  be  valid.  Thus,  meam  squaire  error  favors  using  a  small  number  of 
replicates.  This  differs  from  the  conclusion  reaudied  by  KELTON  (1986)  in  his  amadysis  of 
“replication  splitting”  schemes  for  simulation  of  autoregressive  sequences.  The  arguments 
there  show  that  using  a  large  number  of  replicates  can  reduce  the  variance  of  the  steady- 
state  estimator  when  the  autoregressive  sequence  is  positively  correlated  (i.e.  n  >  o). 
In  our  current  setting,  we  judge  our  estimators  via  mean  square  error  (as  opposed  to 
variamce).  Since  our  error  criterion  explicitly  considers  the  loss  in  estimator  efficiency  due 
to  bias  (vairiamce  does  not  measme  bias),  it  is  not  surprising  that  o\ir  conclusions  differ. 
Of  course,  if  a  is  small  (i.e.  bias  is  not  a  major  problem),  (2.21)  supports  using  a  large 
number  of  replicates  when  i;  >  0, 

To  illustrate  the  above  points,  we  calculate  the  mean  square  error  of  7{m,a,T)  when 
m  =  1^(0  <  p  <  1)  and  a  -  T*{0  <  q  <  l  -  p),  in  which  case  t  —  T^,  where  r  =  i  -  p.  We  find  that 

(2.22)  MSE(F(m,,,r))  =  ^  +  +  o(r»'^»*-»). 

Assuming  that  p-«-  9  <  1/2,  (so  that  the  “big  oh”  term  is  small)  we  find  that  relation  (2.17) 
confirms  the  previous  discussion.  Both  p  and  q  should  be  chosen  small,  in  accordance  with 
our  previous  recommendations. 

3.  A  Non-Rectangular  Sampling  Plan 

The  idea  behind  the  sampling  plan  to  be  described  in  this  section  is  that  we  try 
to  avoid  expending  a  significant  fraction  of  the  computer  time  budget  on  generation  of 


8 


highly  biased  observations.  As  discussed  in  the  Introduction,  the  initial  bias  problem  is 
of  particular  concern  when  the  method  of  multiple  replicates  is  used,  since  the  amount  of 
contaminated  data  is  proportional  to  the  number  of  replicates.  On  the  other  hand,  the 
method  of  multiple  replicates  enjoys  several  significant  advantages:  ease  of  construction 
of  confidence  intervals  and  development  of  parallel  simulation  schemes.  Our  goal  here  is 
to  develop  a  method  that  has  a  multiple  replicate  flavor  and  yet  avoids  the  initial  bias 
difficulties  that  are  associated  with  conventional  multiple  replicate  methods. 

As  in  Section  2,  we  assume  that  the  output  sequence  Y  takes  the  form  y(n)  =  /(X(n)) 
for  some  time-homogeneous  Markov  chain  X,  and  real-valued  function  /.  The  following 
algorithm  employs  one  simulation  of  length  «  to  generate  an  initial  condition  which  is 
reasonably  typical  of  the  steady-state.  This  initial  condition  is  then  used  to  generate  m 
conditionally  independent  replicates  (each  of  length  <)  from  the  output  sequence  Y.  Thus, 
the  effort  to  generate  a  “good”  initial  condition  is  amortized  over  the  m  replicates.  In 
terms  of  observations  generated,  this  sampling  plan  is  non-rectangular  (see  Figure  2). 

The  non-rectangular  sampling  plan  can  be  summarized  as  follows. 

1. )  Given  the  computer  time  budget  T,  choose  the  number  m  of  (conditionally  independent) 

replicates,  and  the  deletion  parameter  •  (0  <  «  <  T). 

2. )  Generate  one  copy  Yo  of  the  sequence  Y  to  time  #. 

3. )  Using  the  initial  condition  Xo(«)  (Xo  is  the  Markov  chain  corresponding  to  Ko),  generate 

m  copies  yi,...,y„  of  Y  to  time  t- 1,  where  t=  [(T-s)/m]. 

4. )  Compute  the  estimator 

tsl  jsO 

We  now  turn  to  computing  the  mean  square  error  of  Y(m,»,T).  As  in  Section  2, 

(3.1)  MSE(y(m,»,r))  =  v«rF(m,*,  r)  -I-  (bias  Y(m,t,T))^. 

Using  the  fact  that  y;(  )=y(-  +  «)  (=  denotes  equality  in  distribution),  we  find  that 

bias  Y(m,t,T)  =  £?^6(X(»),t). 

From  (2.8),  it  therefore  follows  that 

(3.2)  biasy(m,#,r)  =  0(e— ). 

To  handle  the  variance  term  appearing  on  the  right-hand  side  of  (3.1),  we  again  use  the 
variance  decomposition  method: 

(3.3)  vary(m,«,r)  =  var.E{F(m,«,r)|Xo(«)} -I- .Evar{F(m,«,r)|Xo(a)}. 
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It  is  easily  seen  (use  the  fact  that  Yi . Y„  are  independent  and  identically  distributed, 

conditional  on  Ab(4))  that 

(3.4)  =  a^i., 

(3.5)  vtt{y(m.«,r)|Jro(-)}=-t;(Jfo(5),t)  aji. 

in 

Combining  (3.3)  through  (3.5),  we  get 

(3.6)  vary'(m,«,r)  =  — £',4i;(X(«),t)  +  var,»i(Jf(«),t). 

tn 

As  in  Section  2,  we  obtain 

(3.7)  var  Y{m,  a,  T)  =  -f?,w(A-(0),  t)  +  .E,6*(X(0),  t)  +  0(e—) 

m 

(use  (2.7),  (2.8),  and  (2.9)).  Recall  that 

E  )  =  -^-"(^(0).  *)  +  ^-*"(^(0).  0- 

y=o 

(see  Section  2).  Plugging  into  (3.7),  we  get 

(3.8)  v«  y (m.  a.  T)  =  Ivar.l  ^  y (,)  +  0^)  ^»6*(^(0),  *). 

imO  '  ' 

The  first  term  on  the  right-hand  side  of  (3.8)  was  analyzed  in  (2.17).  For  the  second  term, 
note  that 

4(*.*)  =  l6(*)-if;(£?.y(*)-r), 

*  Jkst 

where 

6(*)  =  f;(^.y(*)-r). 

kmO 

From  (2.15),  it  is  evident  that 

(3.9)  fnp|6(x,t)- j5(i)|  =  0(e-^). 

Consequently,  we  obtain  the  inequality 

(3.10)  HX{0),t)  <  i6(X(0))  +  0{e-f*). 

Since  E^Y^k)  »  r,  the  expectations  ^,i(X^(0),t)  and  Eni(X(0))  both  vanish.  From  (3.10), 
we  therefore  get 

E,i^(X(0),  t)  <  ^E,h^(X{0))  +  O(e-^)£;.|6(A-(0))|  +  0(*-3'‘). 
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A  similarly  derived  lower  bound  yields  the  formula 


(3.11)  Enb^{X{0),  t)  =  +  0(«-^‘). 

Let  b  =  f?,6’(Jf(0)).  To  simplify  the  following  discussion,  assume  t  =  {T  -  a)/m  (exactly). 
Combining  (2.13),  (3.8),  and  (3.11),  we  obtain  the  important  relationship 

(3.12)  MSE(i-(m,., D)  =  i  -  i)  +  i  +  0(.-«)  +  0(.-"), 


where  t  and  the  (implicit)  constants  in  the  “big  oh”  terms  are  independent  of 

m,s,  and  T.  Expressin 

(3.13) 

(3.14) 

(3.15) 


in  terms  of 

m,  a 

,  and  T 

',  we 

ff.  - 

<7^ 

a^a 

iol 

mt 

T  ' 

^  'n  ^ 

n  ^ 

mri 

+  ^o 

(-) 

mt^ 

\t) 

/m  —  l\  b 

,  On 

—  I)m6 

V  m  )t^~ 

T2 

+  — 

assuming  that  a  is  small  relative  to  T.  Combining  (3.12)  through  (3.15),  we  obtain  the 
approximation 


(3.16) 


MSE(f(m,«,r))« 


T  T3 


mri  .  m(n  —  1)6 

7^  i>3 


We  now  compare  the  mean  square  error  of  our  non>rectangular  sampling  plan  with  that 
of  a  rectangular  plan  having  the  same  computer  time  budget  T,  number  of  replications 
m,  and  deletion  parameter  a.  Comparing  (3.16)  to  (2.21),  we  see  that  MSE(y(m,5,r))  < 
MSE{y(m,a,r))  when 

o^ma  >  <7^ a  +  6(m^  —  m). 


We  shall  shortly  show  that  6  >  Thus,  y’(m,a,r)  beats  T[m,a,T)  when  am  >  «  +  m*.  This 
will  typically  occur  when  a  is  large  relative  to  m.  Thus,  we  can  expect  f’{m,a,T)  to  have 
smaller  MSE  than  F(m,a,r)  whenever  a  must  be  chosen  relatively  large,  in  order  to  remove 
initial  bias. 

We  can  illustrate  this  point  when  m  *  T’*  (0  <  p  <  l)  and  a  =  T*  (0  <  g  <  1).  Then,  if 
p  +  q<  1/2, 

(3.17)  MSE(F(m,a,r))  =  ya-a  “  j^-p  ys-sp  ^  ’ 

Comparing  (3.17)  to  (2.22),  we  find  that  MSE(f‘(m,a,I^)  <  MSE(r(fn,a,r))  when  p  <  9, 
as  was  suggested  above. 
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We  conclude  this  section  by  showing  b  >  a^.  We  first  observe  that  6(i)  solves  Poisson’s  j 
equation 

b(x)-E,h(X(\))  =  Uix), 
where  /«(*)  =  f[x)  -  r.  Additionally,  S,b(X(0))  =  0.  Then, 

n— 1  n+l 

E  *  E  +  M-X'(O))  -  b(X(n  +  1)) 


where  Dj,  -  b(X(k))  -  E{b(X(k))lX(k  - 1)}  are  martingale  differences.  Note  that  if  X(0)=ir,  we 
can  apply  the  martingale  central  limit  theorem  (see  p.  205  of  BILLINGSLEY  (1968))  to 
conclude  that 

(3.18)  n-»/»  2  /«('*'(*))  =►  ^^(0. 1) 

iksO 

where  =  £,D^.  (The  function  b(  )  is  bounded  under  (2.15).)  If  the  left-hand  side  of  (3.18) 
is  appropriately  uniformly  integrable,  then 


(3.19) 


n-W,^/e(A-(ifc))-A» 


fcsO 


as  n  — ♦  oo.  But 


E  /« W*))  =  j}  Y(j). 

fc=o  y=o 

From  (2.17)  and  (3.19),  it  follows  that  A’  =  Enl>i  =  But  P,  is  orthogonal  to  i(Af(o)), 
being  a  martingale  difference,  and  hence 


£:,6(X(1))»  = 


Since  b  =  Ffb(X(0))^,  it  is  evident  that  b>a^. 


4.  Conclusions 

The  non-rectangular  sampling  plan  introduced  in  this  paper  has  a  lower  mean  square 
error  than  that  of  the  corresponding  rectangular  plan  that  involves  an  equivalent  amount  of 
computer  time,  when  the  ’‘initial  transient”  decays  slowly.  This,  of  cotirse,  is  precisely  the 
setting  in  which  the  method  of  multiple  replicates  exhibits  its  poorest  behavior  (relative  to 
a  single  replicate  method).  Thus,  the  non-rectangular  plan  described  here  is  most  beneficial 
in  precisely  those  problems  for  which  multiple  replicates  is  typically  most  ineffective. 

It  should  be  clear  that  the  replication  component  of  this  non-rectangular  plan  is  well- 
suited  to  parallel  computation.  However,  the  generation  of  the  initial  condition  Xo(5)  is 
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not  easily  adapted  to  the  parallel  setting.  This  aspect  of  the  sampling  plan  described  here 
deserves  further  attention. 

Finally,  it  should  be  mentioned  that  a  great  deal  of  empirical  work  remains  to  be  done 
in  understanding  the  advantages  and  limitations  of  this  non-rectangular  method,  when 
applied  to  “real  world”  problems. 
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